RCrawler: An R package for parallel web crawling and scraping



Similar resources

An Extended Model for Effective Migrating Parallel Web Crawling with Domain Specific and Incremental Crawling

The internet is large and has grown enormously; search engines are the tools for Web site navigation and search. Search engines maintain indices for web documents and provide search facilities by continuously downloading Web pages for processing. This process of downloading web pages is known as web crawling. In this paper we propose the architecture for Effective Migrating Parall...


Data-Parallel Web Crawling Models

The need to quickly locate, gather, and store the vast amount of material in the Web necessitates parallel computing. In this paper, we propose two models, based on multi-constraint graph-partitioning, for efficient data-parallel Web crawling. The models aim to balance the amount of data downloaded and stored by each processor as well as balancing the number of page requests made by the process...


An extended model for effective migrating parallel web crawling with domain specific crawling

The internet is large and has grown enormously; search engines are the tools for Web site navigation and search. Search engines maintain indices for web documents and provide search facilities by continuously downloading Web pages for processing. This process of downloading web pages is known as web crawling. In this paper we propose the architecture for Effective Migrating Parall...


haploR: an R package for querying web-based annotation tools

We developed haploR, an R package for querying the web-based genome annotation tools HaploReg and RegulomeDB. haploR gathers information in a data frame that is suitable for downstream bioinformatic analyses. This will streamline post-genome-wide association study analysis for rapid discovery and interpretation of genetic associations.


haploR: an R-package for querying web-based annotation tools

There exists a set of web-based tools for integrating and exploring information linked to annotated genetic variants. We developed haploR, an R-package for querying such web-based genome annotation tools (currently implemented for HaploReg and RegulomeDB) and gathering information in a format suitable for downstream bioinformatic analyses. This will facilitate post-genome wide association stu...



Journal

Journal title: SoftwareX

Year: 2017

ISSN: 2352-7110

DOI: 10.1016/j.softx.2017.04.004